A Multimodal Approach to Automatic Geo-Tagging of Video
Authors
Abstract
Geo-tags provide essential support for organizing and retrieving the rapidly growing volume of user-captured video content shared online. Videos present a unique opportunity for automatic geo-tagging because they combine multiple information sources: textual metadata, visual cues, and audio cues. This report highlights various approaches (data-driven, semantic technology-based, and graphical model-based) to predicting the geo-location of online videos. The algorithms use the textual, visual, and audio information sources individually or in combination. All experiments were performed with a geo-coordinate prediction benchmarking corpus containing 10,438 videos. The performance of these algorithms is analyzed, revealing that textual metadata is more useful than visual or audio content, but that combining multiple cues gives better overall performance. The report concludes with a discussion of the impact that improved geo-coordinate prediction will have and the challenges that remain open for future research.
Similar resources
Suggestive Geo-Tagging Assistance for Geo-Collaboration Tools
An argumentation map is an online discussion forum for spatially related topics that combines the forum with an interactive map. The utility of an argumentation mapping tool highly depends on the accuracy and quantity of the geo-tags that link the discussion contributions to geographic locations. These geo-tags can be created manually by the users of the argumentation map or automatically by a ...
Achieving Multimodal Cohesion during Intercultural Conversations
How do English as a lingua franca (ELF) speakers achieve multimodal cohesion on the basis of their specific interests and cultural backgrounds? From a dialogic and collaborative view of communication, this study focuses on how verbal and nonverbal modes cohere during intercultural conversations. The data include approximately 160 minutes of transcribed video recordings of ELF interactions ...
Multimodal Automatic Tagging of Music Titles using Aggregation of Estimators
This paper presents a participation in the MusiClef 2012 Multimodal Music Tagging task. It describes an approach that aggregates estimators in order to combine different sources of information.
How Spatial Segmentation improves the Multimodal Geo-Tagging
In this paper we present a hierarchical, multi-modal approach in combination with different granularity levels for the Placing Task at the MediaEval benchmark 2012. Our approach makes use of external resources such as gazetteers to extract toponyms from the metadata, and of visual and textual features to identify similar content. First, the boundaries detection recognizes the country and its dimensio...
Towards an intelligent framework for multimodal affective data analysis
An increasingly large amount of multimodal content is posted on social media websites such as YouTube and Facebook every day. In order to cope with the growth of so much multimodal data, there is an urgent need to develop an intelligent multi-modal analysis framework that can effectively extract information from multiple modalities. In this paper, we propose a novel multimodal information e...